AITopics | ulcerative colitis

ulcerative colitis

Rethinking Retrieval-Augmented Generation for Medicine: A Large-Scale, Systematic Expert Evaluation and Practical Insights

Kim, Hyunjae, Sohn, Jiwoong, Gilson, Aidan, Cochran-Caggiano, Nicholas, Applebaum, Serina, Jin, Heeju, Park, Seihee, Park, Yujin, Park, Jiyeong, Choi, Seoyoung, Contreras, Brittany Alexandra Herrera, Huang, Thomas, Yun, Jaehoon, Wei, Ethan F., Jiang, Roy, Colucci, Leah, Lai, Eric, Dave, Amisha, Guo, Tuo, Singer, Maxwell B., Koo, Yonghoe, Adelman, Ron A., Zou, James, Taylor, Andrew, Cohan, Arman, Xu, Hua, Chen, Qingyu

arXiv.org Artificial Intelligence, Nov-11-2025

Large language models (LLMs) are transforming the landscape of medicine, yet two fundamental challenges persist: keeping up with rapidly evolving medical knowledge and providing verifiable, evidence-grounded reasoning. Retrieval-augmented generation (RAG) has been widely adopted to address these limitations by supplementing model outputs with retrieved evidence. However, whether RAG reliably achieves these goals remains unclear. Here, we present the most comprehensive expert evaluation of RAG in medicine to date. Eighteen medical experts contributed a total of 80,502 annotations, assessing 800 model outputs generated by GPT-4o and Llama-3.1-8B across 200 real-world patient and USMLE-style queries. We systematically decomposed the RAG pipeline into three components: (i) evidence retrieval (relevance of retrieved passages), (ii) evidence selection (accuracy of evidence usage), and (iii) response generation (factuality and completeness of outputs). Contrary to expectation, standard RAG often degraded performance: only 22% of top-16 passages were relevant, evidence selection remained weak (precision 41-43%, recall 27-49%), and factuality and completeness dropped by up to 6% and 5%, respectively, compared with non-RAG variants. Retrieval and evidence selection remain key failure points for the model, contributing to the overall performance drop. We further show that simple yet effective strategies, including evidence filtering and query reformulation, substantially mitigate these issues, improving performance on MedMCQA and MedXpertQA by up to 12% and 8.2%, respectively. These findings call for re-examining RAG's role in medicine and highlight the importance of stage-aware evaluation and deliberate system design for reliable medical LLM applications.
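The evidence-selection figures quoted in this abstract (precision and recall) can be illustrated with a minimal sketch. It assumes each retrieved passage has an ID, that experts marked some passages as relevant, and that the model's response cited some passages; the function name and the sample IDs are hypothetical and are not taken from the paper.

```python
def selection_precision_recall(cited, relevant):
    """Return (precision, recall) for cited passages vs. expert-relevant passages."""
    cited, relevant = set(cited), set(relevant)
    hits = len(cited & relevant)  # passages that were both cited and relevant
    precision = hits / len(cited) if cited else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the model cited four passages, experts judged three relevant.
precision, recall = selection_precision_recall(
    cited=["p1", "p4", "p7", "p9"],
    relevant=["p1", "p2", "p4"],
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision means the response leans on irrelevant passages; low recall means relevant passages were retrieved but left unused.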

large language model, machine learning, natural language, (19 more...)

arXiv: 2511.06738

Country:

North America > United States (1.00)
Europe (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Lesion-Aware Visual-Language Fusion for Automated Image Captioning of Ulcerative Colitis Endoscopic Examinations

Escamilla, Alexis Ivan Lopez, Ochoa, Gilberto, Al, Sharib

arXiv.org Artificial Intelligence, Sep-4-2025

We present a lesion-aware image captioning framework for ulcerative colitis (UC). The model integrates ResNet embeddings, Grad-CAM heatmaps, and CBAM-enhanced attention with a T5 decoder. Clinical metadata (MES score 0-3, vascular pattern, bleeding, erythema, friability, ulceration) is injected as natural-language prompts to guide caption generation. The system produces structured, interpretable descriptions aligned with clinical practice and provides MES classification and lesion tags. Compared with baselines, our approach improves caption quality and MES classification accuracy, supporting reliable endoscopic reporting.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2509.03011

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.98)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Diagnosis and Severity Assessment of Ulcerative Colitis using Self Supervised Learning

Margapuri, Venkat

arXiv.org Artificial Intelligence, Dec-9-2024

Ulcerative Colitis (UC) is an incurable inflammatory bowel disease that leads to ulcers along the large intestine and rectum. The increase in the prevalence of UC, coupled with gastrointestinal physician shortages, stresses the healthcare system and limits the care UC patients receive. A colonoscopy is performed to diagnose UC and assess its severity based on the Mayo Endoscopic Score (MES). The MES ranges between zero and three, wherein zero indicates no inflammation and three indicates that the inflammation is markedly high. Artificial Intelligence (AI)-based neural network models, such as convolutional neural networks (CNNs), are capable of analyzing colonoscopies to diagnose and determine the severity of UC by modeling colonoscopy analysis as a multi-class classification problem. Prior research for AI-based UC diagnosis relies on supervised learning approaches that require large annotated datasets to train the CNNs. However, creating such datasets necessitates that domain experts invest a significant amount of time, rendering the process expensive and challenging. To address the challenge, this research employs self-supervised learning (SSL) frameworks that can efficiently train on unannotated datasets to analyze colonoscopies and aid in diagnosing UC and its severity. A comparative analysis with supervised learning models shows that SSL frameworks, such as SwAV and SparK, outperform supervised learning models on the LIMUC dataset, the largest publicly available annotated dataset of colonoscopy images for UC.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv: 2412.07806

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Arges: Spatio-Temporal Transformer for Ulcerative Colitis Severity Assessment in Endoscopy Videos

Chaitanya, Krishna, Damasceno, Pablo F., Fadnavis, Shreyas, Mobadersany, Pooya, Parmar, Chaitanya, Scherer, Emily, Zemlianskaia, Natalia, Surace, Lindsey, Ghanem, Louis R., Cula, Oana Gabriela, Mansi, Tommaso, Standish, Kristopher

arXiv.org Artificial Intelligence, Oct-1-2024

Accurate assessment of disease severity from endoscopy videos in ulcerative colitis (UC) is crucial for evaluating drug efficacy in clinical trials. Severity is often measured by the Mayo Endoscopic Subscore (MES) and Ulcerative Colitis Endoscopic Index of Severity (UCEIS) score. However, expert MES/UCEIS annotation is time-consuming and susceptible to inter-rater variability, factors addressable by automation. Automation attempts with frame-level labels face challenges in fully-supervised solutions due to the prevalence of video-level labels in clinical trials. CNN-based weakly-supervised models (WSL) with end-to-end (e2e) training lack generalization to new disease scores and ignore spatio-temporal information crucial for accurate scoring. To address these limitations, we propose "Arges", a deep learning framework that utilizes a transformer with positional encoding to incorporate spatio-temporal information from frame features to estimate disease severity scores in endoscopy video. Extracted features are derived from a foundation model (ArgesFM), pre-trained on a large diverse dataset from multiple clinical trials (61M frames, 3927 videos). We evaluate four UC disease severity scores, including MES and three UCEIS component scores. Test set evaluation indicates significant improvements, with F1 scores increasing by 4.1% for MES and 18.8%, 6.6%, 3.8% for the three UCEIS component scores compared to state-of-the-art methods. Prospective validation on previously unseen clinical trial data further demonstrates the model's successful generalization.

argesfm, clinical trial, video, (14 more...)

arXiv: 2410.00536

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

K-QA: A Real-World Medical Q&A Benchmark

Manes, Itay, Ronn, Naama, Cohen, David, Ber, Ran Ilan, Horowitz-Kugler, Zehavi, Stanovsky, Gabriel

arXiv.org Artificial Intelligence, Jan-25-2024

Ensuring the accuracy of responses provided by large language models (LLMs) is crucial, particularly in clinical settings where incorrect information may directly impact patient health. To address this challenge, we construct K-QA, a dataset containing 1,212 patient questions originating from real-world conversations held on K Health (an AI-driven clinical platform). We employ a panel of in-house physicians to answer and manually decompose a subset of K-QA into self-contained statements. Additionally, we formulate two NLI-based evaluation metrics approximating recall and precision: (1) comprehensiveness, measuring the percentage of essential clinical information in the generated answer, and (2) hallucination rate, measuring the number of statements from the physician-curated response contradicted by the LLM answer. Finally, we use K-QA along with these metrics to evaluate several state-of-the-art models, as well as the effect of in-context learning and medically-oriented augmented retrieval schemes developed by the authors. Our findings indicate that in-context learning improves the comprehensiveness of the models, and augmented retrieval is effective in reducing hallucinations. We make K-QA available to the community to spur research into medically accurate NLP applications.
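The two metrics described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the NLI judgment for each physician statement is taken as already computed and given as one of the hypothetical labels "entailed", "contradicted", or "neutral", and each statement carries a flag saying whether physicians considered it essential. The function name and data layout are not from the paper.

```python
def kqa_style_scores(statements):
    """Score one generated answer.

    statements: list of (is_essential, nli_label) pairs, one per
    physician-written statement; nli_label is an assumed, precomputed
    judgment of the answer against that statement.
    Returns (comprehensiveness in percent, number of contradicted statements).
    """
    essential_labels = [label for is_essential, label in statements if is_essential]
    entailed = sum(1 for label in essential_labels if label == "entailed")
    # Share of essential statements that the answer covers.
    comprehensiveness = 100.0 * entailed / len(essential_labels) if essential_labels else 0.0
    # Count of physician statements the answer contradicts.
    hallucinations = sum(1 for _, label in statements if label == "contradicted")
    return comprehensiveness, hallucinations


# Hypothetical answer judged against five physician statements.
example = [
    (True, "entailed"),
    (True, "neutral"),
    (True, "entailed"),
    (False, "contradicted"),
    (True, "contradicted"),
]
print(kqa_style_scores(example))
```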

information, pregnancy, right lower abdominal pain, (15 more...)

arXiv: 2401.14493

Country:

South America > Brazil (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Indiana > Hamilton County > Carmel (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Gastrointestinal Disorder Detection with a Transformer Based Approach

Hosain, A. K. M. Salman, Islam, Mynul, Mehedi, Md Humaion Kabir, Kabir, Irteza Enan, Khan, Zarin Tasnim

arXiv.org Artificial Intelligence, Oct-6-2022

Accurate disease categorization using endoscopic images is a significant problem in gastroenterology. This paper describes a technique for assisting medical diagnosis procedures and identifying gastrointestinal tract disorders based on the categorization of characteristics taken from endoscopic pictures using a vision transformer and transfer learning model. The vision transformer has shown very promising results on difficult image classification tasks. In this paper, we have suggested a vision transformer based approach to detect gastrointestinal diseases from wireless capsule endoscopy (WCE) curated images of the colon with an accuracy of 95.63%. We have compared this transformer based approach with the pretrained convolutional neural network (CNN) model DenseNet201 and demonstrated that the vision transformer surpassed DenseNet201 in various quantitative performance evaluation metrics.

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.1109/IEMCON56893.2022.9946531

arXiv: 2210.03168

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Patch-level instance-group discrimination with pretext-invariant learning for colitis scoring

Xu, Ziang, Ali, Sharib, Gupta, Soumya, Leedham, Simon, East, James E, Rittscher, Jens

arXiv.org Artificial Intelligence, Jul-11-2022

Inflammatory bowel disease (IBD), in particular ulcerative colitis (UC), is graded by endoscopists, and this assessment is the basis for risk stratification and therapy monitoring. Presently, endoscopic characterisation is largely operator-dependent, sometimes leading to undesirable clinical outcomes for patients with IBD. We focus on the Mayo Endoscopic Scoring (MES) system, which is widely used but requires the reliable identification of subtle changes in mucosal inflammation. Most existing deep learning classification methods cannot detect these fine-grained changes, which makes UC grading such a challenging task. In this work, we introduce a novel patch-level instance-group discrimination with pretext-invariant representation learning (PLD-PIRL) for self-supervised learning (SSL). Our experiments demonstrate both improved accuracy and robustness compared to the baseline supervised network and several state-of-the-art SSL methods. Compared to the baseline (ResNet50) supervised classification, our proposed PLD-PIRL obtained an improvement of 4.75% on hold-out test data and 6.64% on unseen center test data for top-1 accuracy.

dataset, pld-pirl, representation, (16 more...)

arXiv: 2207.05192

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)

Genre: Research Report > Experimental Study (0.66)

Industry: Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Class Distance Weighted Cross-Entropy Loss for Ulcerative Colitis Severity Estimation

Polat, Gorkem, Ergenc, Ilkay, Kani, Haluk Tarik, Alahdab, Yesim Ozen, Atug, Ozlen, Temizel, Alptekin

arXiv.org Artificial Intelligence, Feb-9-2022

The Endoscopic Mayo score and the Ulcerative Colitis Endoscopic Index of Severity are commonly used scoring systems for the assessment of endoscopic severity of ulcerative colitis. They are based on assigning a score in relation to the disease activity, which creates a rank among the levels, making it an ordinal regression problem. On the other hand, most studies train their deep learning models with the categorical cross-entropy loss function, which is not optimal for an ordinal regression problem. In this study, we propose a novel loss function called class distance weighted cross-entropy (CDW-CE) that respects the order of the classes and takes the distance between classes into account when calculating the cost. Experimental evaluations show that CDW-CE outperforms the conventional categorical cross-entropy and the CORN framework, which is designed for ordinal regression problems. In addition, CDW-CE does not require any modifications at the output layer and is compatible with class activation map visualization techniques.
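The idea of weighting the cost by class distance can be sketched for a single sample as below. This is one form consistent with the description in the abstract, not necessarily the authors' exact formulation: probability mass placed on a wrong class is penalized by -log(1 - p), scaled by the distance to the true class raised to a power. The exponent `alpha`, its default value, and the function name are assumptions for illustration.

```python
import math


def distance_weighted_ce(probs, true_class, alpha=2.0):
    """Penalize probability on wrong classes more the farther they are from the true class.

    probs: predicted class probabilities (e.g. softmax output) for one sample.
    alpha: assumed exponent controlling how fast the penalty grows with distance.
    """
    loss = 0.0
    for i, p in enumerate(probs):
        if i == true_class:
            continue  # distance zero: the true class adds no penalty term
        distance = abs(i - true_class)
        # Clamp to avoid log(0) when a wrong class receives probability 1.
        loss -= math.log(max(1.0 - p, 1e-12)) * distance ** alpha
    return loss


# True class is 0 in both cases; the stray 0.2 of probability sits on an
# adjacent class in the first case and on the farthest class in the second.
near_miss = distance_weighted_ce([0.7, 0.2, 0.1, 0.0], true_class=0)
far_miss = distance_weighted_ce([0.7, 0.0, 0.1, 0.2], true_class=0)
print(near_miss < far_miss)
```

Because the loss operates on ordinary class probabilities, a sketch like this is compatible with a standard softmax output layer, which matches the abstract's remark that no output-layer changes are needed.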

architecture, loss function, ulcerative colitis, (13 more...)

arXiv: 2202.05167

Country:

Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automatic Estimation of Ulcerative Colitis Severity from Endoscopy Videos using Ordinal Multi-Instance Learning

Schwab, Evan, Cula, Gabriela Oana, Standish, Kristopher, Yip, Stephen S. F., Stojmirovic, Aleksandar, Ghanem, Louis, Chehoud, Christel

arXiv.org Artificial Intelligence, Sep-29-2021

Ulcerative colitis (UC) is a chronic inflammatory bowel disease characterized by relapsing inflammation of the large intestine. The severity of UC is often represented by the Mayo Endoscopic Subscore (MES) which quantifies mucosal disease activity from endoscopy videos. In clinical trials, an endoscopy video is assigned an MES based upon the most severe disease activity observed in the video. For this reason, severe inflammation spread throughout the colon will receive the same MES as an otherwise healthy colon with severe inflammation restricted to a small, localized segment. Therefore, the extent of disease activity throughout the large intestine, and overall response to treatment, may not be completely captured by the MES. In this work, we aim to automatically estimate UC severity for each frame in an endoscopy video to provide a higher resolution assessment of disease activity throughout the colon. Because annotating severity at the frame-level is expensive, labor-intensive, and highly subjective, we propose a novel weakly supervised, ordinal classification method to estimate frame severity from video MES labels alone. Using clinical trial data, we first achieved 0.92 and 0.90 AUC for predicting mucosal healing and remission of UC, respectively. Then, for severity estimation, we demonstrate that our models achieve substantial Cohen's Kappa agreement with ground truth MES labels, comparable to the inter-rater agreement of expert clinicians. These findings indicate that our framework could serve as a foundation for novel clinical endpoints, based on a more localized scoring system, to better evaluate UC drug efficacy in clinical trials.
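The limitation described in this abstract, that a video-level MES reflects only the most severe frame, can be shown with a toy sketch. The per-frame scores below are invented for illustration, and the function names are hypothetical; the paper's actual model is a weakly supervised ordinal classifier, which this sketch does not implement.

```python
def video_mes(frame_scores):
    """Video-level score as the most severe per-frame score (0 to 3)."""
    return max(frame_scores)


def fraction_at_or_above(frame_scores, threshold):
    """Share of frames whose score reaches the given severity threshold."""
    return sum(1 for s in frame_scores if s >= threshold) / len(frame_scores)


# Two invented videos of ten frames each.
diffuse = [3] * 10          # severe inflammation throughout
localized = [0] * 9 + [3]   # healthy except for one severe segment

# Both videos receive the same video-level score...
print(video_mes(diffuse), video_mes(localized))
# ...while a frame-level summary separates them.
print(fraction_at_or_above(diffuse, 3), fraction_at_or_above(localized, 3))
```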

agreement, classification, video, (14 more...)

arXiv: 2109.14685

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

A machine learning approach identifies 5-ASA and ulcerative colitis as being linked with higher COVID-19 mortality in patients with IBD - Docwire News

#artificialintelligence, Aug-14-2021, 18:25:55 GMT

Inflammatory bowel diseases (IBD), namely Crohn's disease (CD) and ulcerative colitis (UC), involve chronic inflammation within the gastrointestinal tract. IBD patient conditions and treatments, such as immunosuppressants, may result in a higher risk of viral and bacterial infection and more severe outcomes of infections. The effect of clinical and demographic factors on the prognosis of COVID-19 among IBD patients is still a significant area of investigation. The lack of available data on a large set of COVID-19 infected IBD patients has hindered progress. To circumvent this lack of large patient data, we present a random sampling approach to generate clinical COVID-19 outcomes (outpatient management, hospitalized and recovered, and hospitalized and deceased) on 20,000 IBD patients modeled on reported summary statistics obtained from the Surveillance Epidemiology of Coronavirus Under Research Exclusion (SECURE-IBD), an international database to monitor and report on outcomes of COVID-19 occurring in IBD patients.

higher covid-19 mortality, ibd patient, ulcerative colitis, (7 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
